Efficiently Processing of Top-k Typicality Query for Structured Data
Abstract
This work presents a novel ranking scheme for structured data. We show how the notion of typicality analysis from cognitive science can be used to formulate the problem of ranking data with categorical attributes. First, we formalize the typicality query model for relational databases. We adopt the Pearson correlation coefficient to quantify the typicality of an object; the coefficient estimates the strength of the statistical relationship between two variables from the patterns of occurrence and absence of their values. Second, we develop a top-k query processing method for efficient computation. The method, TPFilter, prunes unpromising objects based on tight upper bounds and selectively joins the tuples with the highest typicality scores. Experimental results on real data show that the approach is promising.
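The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' TPFilter: the abstract does not give the exact typicality score or the bound derivation. The first function computes the Pearson correlation of two 0/1 indicator vectors (occurrence or absence of a value), which for binary data is the phi coefficient. The second shows generic upper-bound pruning for top-k retrieval. The function names and the `upper_bound` and `exact_score` callables are hypothetical, and the sketch assumes every upper bound is at least the candidate's true score.

```python
import heapq
import math


def pearson_binary(x, y):
    """Pearson correlation of two equal-length 0/1 vectors (phi coefficient)."""
    n = len(x)
    n11 = sum(1 for a, b in zip(x, y) if a and b)  # both values present
    nx = sum(x)  # occurrences in x
    ny = sum(y)  # occurrences in y
    denom = math.sqrt(nx * (n - nx) * ny * (n - ny))
    if denom == 0:
        return 0.0  # correlation is undefined for a constant vector; use 0 here
    return (n * n11 - nx * ny) / denom


def top_k_with_bounds(candidates, k, upper_bound, exact_score):
    """Return the k best (score, candidate) pairs and the number of exact evaluations.

    Assumes upper_bound(c) >= exact_score(c) for every candidate c.
    """
    order = sorted(candidates, key=upper_bound, reverse=True)
    heap = []  # min-heap holding the k best exact scores seen so far
    evaluated = 0
    for c in order:
        # Once the best remaining bound cannot beat the current k-th score,
        # no later candidate can enter the result, so stop early.
        if len(heap) == k and upper_bound(c) <= heap[0][0]:
            break
        s = exact_score(c)
        evaluated += 1
        if len(heap) < k:
            heapq.heappush(heap, (s, c))
        elif s > heap[0][0]:
            heapq.heapreplace(heap, (s, c))
    return sorted(heap, reverse=True), evaluated


# Hypothetical toy data: exact scores and looser upper bounds per object.
scores = {"a": 0.9, "b": 0.5, "c": 0.4, "d": 0.1}
bounds = {"a": 1.0, "b": 0.95, "c": 0.45, "d": 0.2}
result, evaluated = top_k_with_bounds(scores, 2, bounds.get, scores.get)
```

In the toy data, only two of the four objects need an exact score: once the second-best exact score is known, the upper bounds of the remaining objects already fall below it. Tighter bounds let the loop stop earlier, which is why the abstract emphasizes tight upper bounds.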
Similar resources
Efficiently Answering Top-k Typicality Queries on Large Databases
Finding typical instances is an effective approach to understand and analyze large data sets. In this paper, we apply the idea of typicality analysis from psychology and cognitive science to database query answering, and study the novel problem of answering top-k typicality queries. We model typicality in large data sets systematically. To answer questions like “Who are the top-k most typical N...
TopX: efficient and versatile top-k query processing for text, structured, and semistructured data
TopX is a top-k retrieval engine for text and XML data. Unlike Boolean engines, it stops query processing as soon as it can safely determine the k top-ranked result objects according to a monotonous score aggregation function with respect to a multidimensional query. The main contributions of the thesis unfold into four main points, confirmed by previous publications at international conference...
Overview of Top-k Query Processing in Relational Databases
Query processing is a fundamental part of a database management system. As the amount of text data stored in relational databases increases, it becomes necessary to support top-k query processing over text data. The main objective of top-k query processing is to return the k highest-ranked results quickly and efficiently. In this paper, we introduce the Top-k query processing in relational dat...
Keyword Search over Graph-structured Data for Finding Effective and Non-redundant Answers
In this paper, we propose a new method for keyword search over large graph-structured data to find a set of answers which are not only relevant to the query but also reduced and duplication-free. We define an effective answer structure and a relevance measure for the candidate answers to a keyword query on graph data. We suggest an efficient indexing scheme on relevant and useful paths from nod...
Guest Editors Introduction: Special Section on Keyword Search on Structured Data
With the prevalence of Web search engines, keyword search has become the most popular way for users to retrieve information from text documents. On the other hand, there is an enormous amount of valuable information stored in structured form (relational or semistructured) in Internet, intranet, and enterprise databases. To query such data sources, users traditionally depended on specialized app...